Abstract

This longitudinal case study tracks an adult second-language (L2) learner’s quality and 
quantity of encounters with 20 vocabulary items in an English for Academic Purposes 
course over 3 months. The learner completed pretest and posttest vocabulary knowledge 
interviews, submitted course materials and notes for analysis, and was observed during 
class lessons. The results show that frequency of encounters contributes more to 
vocabulary learning than contextual richness does. In addition, the case study data 
illustrate the highly incremental nature of L2 vocabulary acquisition in a naturalistic 
context. 

Keywords: longitudinal, case study, vocabulary frequency, vocabulary depth, contextual richness, 
generative processing 


A common concern among teachers and learners in intensive English for Academic Purposes 
(EAP) programmes is the extent to which previously learnt, known and new vocabulary items 
encountered in text are available for subsequent use in the immediate and longer term. Learners 
who are faced with reading and understanding academic texts are often frustrated by their 
inability to retrieve words or their meanings on demand. 

Given the incremental nature of vocabulary acquisition, longitudinal studies that track learners’ 
encounters with words in particular contexts can provide insights into precisely how learners 
approach vocabulary learning, both within and beyond language classes. They can also reveal the 
types of vocabulary encounters that are likely to contribute to long-term retrieval of word form 
and meaning and to productive use.

This case study illustrates the role of three contributing factors to vocabulary learning: quality of 
input, quality of output, and frequency of occurrences with target vocabulary items. While there 
are studies that investigate one or two of these factors, to my knowledge, there are no studies that 
investigate all three, drawing on both quantitative and qualitative data. 

Following Paul Nation’s tenet of adopting a rigorous approach to research design (see Nation & 
Webb, in press), this naturalistic classroom research, consisting of a single case study,


http://nflrc.hawaii.edu/rfl 
triangulates quantitative and qualitative data at different points to better understand how 
frequency, contextual richness, and generative use enhance vocabulary learning. It also reflects 
his interest in gaining insights from more fine-grained single cases to complement larger-scale 
experimental studies. Moreover, it draws on an area of scholarship within which Paul Nation has 
contributed enormously: the necessary conditions for vocabulary acquisition. 


Quality of Input 

Contextual richness can contribute to vocabulary acquisition, but, on its own, does not appear to 
be a sufficient condition for later recall of word forms and meanings. Studies promoting the use 
of rich, clear contexts for vocabulary acquisition (e.g., Schouten-van Parreren, 1989) are 
tempered with studies claiming that there may be greater benefits for reading comprehension 
than vocabulary learning (e.g., Mondria & Wit-de Boer, 1991; Parry, 1991). In a study 
comparing the quality (richness of context) and quantity of vocabulary encounters in input, Zahar, 
Cobb, and Spada (2001) found no consistent pattern showing that rich, directive contexts led to 
greater vocabulary growth. On the contrary, they found highly variable contexts to be favourable. 
They suggest it is likely that variability enhances vocabulary learning because it exposes learners 
to a wide range of natural contexts in which words can occur. Opaque, unclear contexts might 
trigger learners to notice the word, piquing learners’ curiosity and paving the way for close 
selective attention to the word in clear contexts met in the future. Similarly, Haastrup (1989, pp. 
319-320) has pointed out that puzzling over problems with word meanings in context involves 
greater cognitive engagement, which helps subsequent recall both in highly variable and in rich, 
clear contexts. It is of interest to investigate whether or not target words that are learned better in 
an EAP programme have been embedded in clear contexts with relevant clues to the word 
meanings. 


Quality of Output 

Studies from cognitive psychology on depth of processing and memory (Craik & Lockhart, 1972) 
have shown that long-term retention is influenced by the level at which information is processed. 
Processing begins at shallow sensory levels, such as how a word is pronounced, and progresses 
to deeper levels, such as analyzing the meaning of words and relating this to stored knowledge in 
memory, thus leaving a more permanent memory trace. 

Taking this theory further, Craik and Tulving (1975) maintained that elaborative processing 
enhanced long-term retrieval by strengthening the memory trace. In other words, actively 
generating new information by connecting new and known information enriches semantic 
networks. The critical point is that these richer connections set up much more distinct and 
“discriminable” memory traces than those items that are not elaborated (Baddeley, 1990, p. 170). 
These, in turn, enhance later recognition and recall (Baddeley, 1998). 

Based on this framework, a number of studies investigating depth of cognitive processing and 
second language (L2) vocabulary acquisition have found that semantic elaboration indeed 
enhances later recall. Joe (1998) compared two groups of learners who completed a read and 
retell task. One group had access to the text during the retelling whereas the other did not. Using 
a generative knowledge scale, measuring different levels of semantic elaboration, she found that 
those who made the greatest vocabulary learning gains in a posttest were those who retrieved and 
used the target words in novel ways from the original text. 

Barcroft (2004) conducted an experiment comparing a group of learners who wrote sentences 
using new word forms with a group of learners who only viewed the words. He argued that 
semantic elaboration has a facilitative effect for partially known but not completely new words, 
mnemonic techniques aside. He further argued that recognition tasks where word forms are 
provided can facilitate acquisition, but semantic elaboration tasks drawing on learners’ own 
knowledge do not. 

Cognisant of the fact that the quality of learning, or depth of processing, is but one factor 
affecting vocabulary learning, Laufer and Hulstijn (2001) proposed an involvement load 
hypothesis, combining three factors: need (motivation), search (use of references), and evaluation 
(appropriate use of a word in context). Hulstijn and Laufer (2001) applied their construct to two 
groups of advanced university students in Israel and the Netherlands who were studying English 
as a foreign language (EFL). They compared the involvement load and the amount of incidental 
vocabulary learning from three different task conditions: (a) reading comprehension with the aid 
of marginal glosses, (b) reading comprehension and gap-fill exercises, and (c) writing a letter to 
an editor integrating the target words. As proposed, they found that the amount of vocabulary 
learning was determined by cognitive engagement with particular tasks. These studies point to 
the facilitative effect of deep-level semantic processing. 


Frequency of Occurrence 

Although explicit, elaborative learning at the semantic level is crucial for the maintenance of 
vocabulary knowledge, implicit (subconscious) knowledge and explicit attention to word form 
also contribute to long-term retention. Baddeley (1990, pp. 160 & 172) maintained that rote 
rehearsal alone is not as effective as deeper elaborate rehearsal for storing and processing 
knowledge, but substantial quantities of rote rehearsal may activate existing lexical items, 
thereby facilitating subsequent word recognition. Many studies on acquiring vocabulary 
incidentally and incrementally through reading refer to this explanation to account for 
vocabulary gains from frequent exposures to words (e.g., Beck, Perfetti, & McKeown, 1982; 
Brown, 1993; Krashen, 1989; Nagy, Herman, & Anderson, 1985; Rott, 1999). N. Ellis (1995) 
suggested that repeated exposure to the regularities of words’ surface phonological and 
orthographical features in spoken and written input helps learners to recognise and produce those 
forms subsequently. 

In an experimental study investigating the contribution of phonological repetition in long-term 
memory, Ellis and Sinclair (1996) argued that rehearsing aloud sequences of phonemes in a 
foreign or second language helps to establish regular language patterns that are abstracted and 
stored for later reference. Subsequent exposures to these familiar words and word sequences 
serve to consolidate their long-term representation both receptively and productively. Conversely, 
the greater the exposure to possible word sequences, and increased long-term storage of these, 
the greater the likelihood of those words being accessed automatically. In turn, the greater 
capacity for freed-up attentional resources in short-term memory allows for more complex 
language processing (see R. Ellis, 2002). 

These studies suggest that conscious effort to learn the semantic and conceptual aspects of words, 
using deep and elaborative processes, is needed to prevent attrition, and repetitive exposure to 
the form of words is required to establish words’ surface features. How frequently should 
learners encounter words, though? 

Experimental research indicates that distributed practice of words over a number of days is 
preferable to massed practice, or exposure to words in fewer periods but in rapid succession 
(Baddeley, 1990, p. 173; Dempster, 1987; Laufer & Osimo, 1991; Mondria & Mondria-de Vries, 
1994). Research tracking long-term retention of words from repeated readings of simplified 
novels in case studies (Horst, 2005; Horst & Meara, 1999) or intact classes (e.g., Cho & Krashen, 
1994; Taguchi, Takayasu-Maass, & Gorsuch, 2004) has shown how repeated exposure to words 
within these reading books enhances vocabulary acquisition. However, because these studies 
have been conducted under controlled conditions with the manipulation of discrete independent 
variables, it is difficult to know to what extent they apply to naturalistic conditions where 
learners hear or see words in multiple data sources over time. 

Taking quality of input or output and frequency together, these studies indicate that all three 
aspects contribute to long-term vocabulary acquisition. However, the role of frequency appears 
to be most important. Receptive and productive knowledge of a word involves attention to its 
forms, meanings, and uses in a range of contexts (Nation, 2001, p. 27). Without exposure, it 
would not be possible to develop these different dimensions of vocabulary knowledge. 

Experimental research on implicit learning (N. Ellis, 1995) has suggested that repeated exposure 
to words’ formal features in input is crucial if words are to be established in learners’ lexicons. 
Furthermore, in a study of incidental reading that compared frequency and contextual richness, 
Zahar et al. (2001) came to the conclusion that vocabulary acquisition was a function of 
frequency. Laufer and Hulstijn (2001) also suggested that frequency be considered alongside 
depth of processing when investigating vocabulary growth. What is of interest in this study is 
whether the overriding importance of frequency is corroborated in research where learners are 
tracked during their regular course of study, without any instructional intervention. The main 
question and secondary questions investigated in this case study are listed below. 

1. The main question: Are words that are encountered frequently learned better, 
irrespective of the richness of context and the type of cognitive processing? 

2. The secondary questions: (a) How many encounters with target words are needed to 
shift them from one state of vocabulary knowledge to another? (b) Are words embedded 
in rich, clear contexts learned better? (c) Is evidence of greater depth of processing 
associated with greater vocabulary development? 

Method 

This case study of a single learner is derived from a larger study investigating the quality and 
frequency of four L2 learners’ encounters with vocabulary as they studied in an academic 
English preparation course over 3 months. It draws on both quantitative and qualitative data to 
illustrate actual target vocabulary encounters in the normal course of instruction. The study 
focuses on three aspects: (a) how often words were encountered, (b) the depth of cognitive 
processing evident in tasks involving output, and (c) the richness of written and aural contexts in 
which words were embedded. 

In any intensive programme, it is impossible to track all data sources and use of target words 
because of the unacceptably high level of intrusion. A more realistic approach involved targeting 
vocabulary use at the beginning, middle, and final weeks of the course. To capture the 
distribution of target items over the course, this study examined the total number of days that 
learners encountered the words, as well as actual encounters with the words on particular dates. 

To assess learners’ quality of cognitive engagement with and opportunities to encounter new 
vocabulary, four data gathering procedures were employed throughout the course: (a) collection 
of written texts from learners and teachers, (b) non-participant classroom observations, (c) semi- 
structured interviews about vocabulary learning practices, and (d) structured pretest and posttest 
interviews. 

The design of the study involved non-participant observations of learner interactions on 1 day 
each week throughout the course. There were also daily observations of the full class programme 
for 1 week at the start (Week 2), middle (Week 6) and end (Week 10) of the course. A key 
purpose was to record particular sources of target vocabulary use in the classroom. Vocabulary 
knowledge interviews to test target words were conducted in Weeks 2, 3, 6, 7, and 12. Learners 
submitted class-related and independent language learning materials at twice-weekly intervals. 

Throughout the programme learners received a total of 25 hours of content-based, integrated- 
skill instruction from two teachers each week. Three key components of the course included (a) 
studying 40 words for personalized weekly vocabulary tests, (b) using theme booklets as a basis 
for all integrated-skill work in class, and (c) completing an oral and written news log about one 
issue throughout the course. 

Participant 

The participant in this case study, Zeki, was a married, 23-year-old student from Turkey who had 
lived in New Zealand for 14 months. He had completed two-thirds of an economics degree in 
Turkey but had not studied English previously. During the main study, he was enrolled in his 
second 14-week EAP course at a New Zealand university and was aiming to embark on 
undergraduate courses in economics, politics, and history. 

Zeki placed into the highest-level class. A diagnostic measure of receptive vocabulary 
knowledge, the Vocabulary Levels Test (Nation, 1990), indicated that he knew about two-thirds 
of the second thousand most frequent words of English and about half of the third thousand and 
University Word List items. 

Target Words 

To investigate different encounters Zeki had with vocabulary in the long term, attempts were 
made to choose words that received different amounts of processing as a result of different task 
demands and task purposes. 

Zeki was tested on a total of 74 words. Thirty words were chosen from study themes, which the 
teacher planned to cover in class, and 44 words were chosen from Zeki’s own in-class or out-of- 
class language use or learning materials. Of the 30 words selected from the class themes, 15 
words were tested in Week 2 and the other 15 in Week 6. The 15 theme words were either (a) 
met incidentally in reading and listening or (b) considered central for comprehending key 
concepts and worth studying because of their high frequency and wide application to a variety of 
contexts. The class teacher and a student from a previous course were asked to read the themes 
and compile a list of 10 words considered to be central, probably unknown, and worthwhile for 
future purposes. The researcher, teacher, and student compared lists and agreed on which words 
to include. Another five words were selected in case Zeki already knew the precise meanings of 
words presented previously, and could use them accurately and appropriately in a sentence. 

Zeki’s 44 individualised target words were tested in two sets: one set of 22 words in Week 3 and 
another set of 22 in Week 7. Each set was divided into the following six categories: (a) 5 words 
from individualised vocabulary lists, (b) 5 words from Zeki’s own class study notes, (c) 3 words 
from Zeki’s writing, (d) 3 words from Zeki’s speaking in class, (e) 3 words from Zeki’s listening 
in or out of class, and (f) 3 words from Zeki’s independent reading. 

Time constraints meant that only a small number of words could be administered in the posttest 
in Week 12 of the course. Words that were either known well prior to the course or that did not 
arise in class tasks were discarded. In the end, a total of 20 partially known or unknown words 
were included (see the Results and Discussion section). The words fell into three main categories: 
(a) 3 words used as part of a task sequence with guided teacher input such as dictoglosses or 
tasks involving reading comprehension and discussion; (b) 4 words used in tasks receiving less 
teacher intervention and requiring learners to take more responsibility such as essay writing, 
direct study for vocabulary tests or news logs; and (c) 13 words encountered incidentally in 
reading or listening. 
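
The selection arithmetic described in this section can be checked with a short sketch. Only the counts come from the text; the variable names and category labels are illustrative assumptions, not terms used in the study.

```python
# Counts are taken from the Target Words section; all names are hypothetical labels.
theme_words = {"week_2": 15, "week_6": 15}

# Six categories per individualised set: vocabulary lists, study notes,
# writing, speaking, listening, independent reading.
individualised_set = [5, 5, 3, 3, 3, 3]
individualised_words = {
    "week_3": sum(individualised_set),
    "week_7": sum(individualised_set),
}

# Total number of words tested across the four testing weeks.
total_tested = sum(theme_words.values()) + sum(individualised_words.values())

# Posttest words by task type: guided tasks, learner-directed tasks, incidental.
posttest_words = {"guided": 3, "learner_directed": 4, "incidental": 13}
posttest_total = sum(posttest_words.values())

print(total_tested, posttest_total)
```

The sketch simply confirms that the reported subtotals add up to the 74 words tested and the 20 words retained for the posttest.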

Procedure and Measures 

A semi-structured interview format provided Zeki with opportunities to demonstrate knowledge 
and use of the 20 target words met during the course. Drawing on the importance of using 
multiple sensitive vocabulary measures (Joe, 1995; Nagy, Herman, & Anderson, 1985; Nation, 
2001, p. 361), five measures were developed. Three measures elicited different aspects of word 
knowledge or use: (a) knowledge of a word’s form and meaning, (b) knowledge of a word’s 
associates, and (c) the ability to generate a sentence accurately and appropriately using the target 
word. A fourth measure assessed how precisely a meaning of a word could be inferred from 
contextual clues or how precisely a word’s meaning could be linked to existing word knowledge 
by using target words in novel ways. 

Scripted prompts were used to probe for more elaborate word meanings or illustrative sentences 
when these were not forthcoming. The interview proceeded along the following lines. Zeki was 
told that the focus of the interview was on testing his knowledge of vocabulary. He was then 
asked to define the target word provided (“What does this word mean?”) and to use the word in a 
sentence (i.e., “Can you use that word in a sentence?”). The first time he completed the tests, he 
practised the procedure on two known words. 

At the end of the productive test, Zeki completed the fifth measure: a word recognition task. This 
was developed to tap partial knowledge of word meanings and associations that he was unable to 
express during the interview phase.

Vocabulary knowledge scale. The three vocabulary knowledge and use measures were adapted 
from the Vocabulary Knowledge Scale (VKS), developed by Wesche and Paribakht (1996). The 
VKS is a single progressive rating scale designed to identify five incremental stages of 
vocabulary knowledge. One problem of the VKS is that each step of the uni-dimensional scale 
lacks precision and detail. To overcome this problem, multiple scoring scales to measure 
different dimensions of declarative word knowledge and use were devised. 

Table 1 shows the first measure, which assessed knowledge of form and meaning. It takes the 
first four descriptors from Read’s (1994) adapted VKS scoring scale to measure word knowledge 
and recognition of word form. 

Table 1. Knowledge of form and meaning 
Score Interpretation 

0 The word is not familiar 

1 The word is familiar but the meaning is not known 

2 One meaning of the word is partly known 

3 One meaning of the word is known 

4 A second meaning of the word is partly known 

5 A second meaning of the word is known 


The second measure assessed learners’ ability to produce word associates for target items in 
phrases. This is shown in Table 2. 

Table 2. Ability to produce word associates 

Score Interpretation 

0 No evidence of ability to use the target item in context 

1 Attempts to use associates that are not plausible 

2 Can use plausible associates for one meaning within one context 

3 Can use plausible associates for one meaning in more than one context 

4 Can use plausible associates for a second word meaning 

The third measure assessed the ability to produce a well-formed, plausible sentence. Like the 
word associates scale, grammatical accuracy and plausibility were rated along five points. The 
scoring is presented in Table 3. 

Table 3. Ability to use words in context 

Score Interpretation 

0 No evidence of ability to use the target item in context 

1 Partial evidence of ability to use the target item in context 

2 Can use the target item in a plausible sentence with reasonable accuracy 

3 Can use the target item in a plausible sentence with a high level of accuracy 

4 Can use a different word form in a plausible sentence with reasonable accuracy 


Scoring 

Scoring procedures were adapted from a written version of the VKS used by Scarcella and 
Zimmerman (1998) with tertiary level learners of English as a second language in an academic 
writing programme. Four modifications were made. First, because an oral interview format was 
used, where learners were probed when answers needed clarification, the principle of giving full 
credit for unclear responses was less applicable; when an ambiguous sentence was presented, 
credit was apportioned according to the response. Full credit was not automatic. Second, while 
Scarcella and Zimmerman rated spelling in their writing samples, spelling was not tested in the 
oral interview. Third, partial credit was awarded when learners clearly demonstrated general 
understanding of the word but supplied implausible sentences because of confusion with 
similarly related words. Fourth, when learners supplied more than one illustrative sentence, with 
one being highly plausible and accurate, and a second being implausible or plausible but less 
accurate, then, credit was given for the most plausible and accurate. Five principles were applied: 

1. Give credit when grammatical errors do not relate to the target word (e.g., “The 18 
years old guy offended the girl who was walking . . . .”). 

2. Give a score of 0 for errors indicating that learners do not have any knowledge of the 
meaning of the word (e.g., “I distress the man who disturb me,” where distress was 
defined as “dislike.”). 

3. Give a score of 1 for errors indicating that learners have partial knowledge of the word 
meaning but have confused its use with a closely related word (e.g., “Go away or expel. 
I’m not sure exactly. I can say he was dispelled by his boss.”). 

4. Give credit to words that are changed to a different part of speech and are used 
correctly at Levels 1, 2, and 3 (e.g., “The school deter the children from doing wrong 
thing.” [Target word is deterrent.] Word associates = 2, Use in context = 2). 

5. Give credit for incomplete sentences as long as the learner indicates they know how to 
use the word. 
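
As a minimal sketch of how scores on the three knowledge and use measures might be recorded and range-checked, the following assumes the score ranges given in Tables 1 to 3 (meaning 0-5, associates 0-4, use in context 0-4). The names `SCORE_RANGES` and `check_score` are hypothetical and not part of the study's procedure; the example values are those reported under Principle 4.

```python
# Score ranges as given in Tables 1 to 3: meaning 0-5, associates 0-4,
# use in context 0-4. The dictionary keys are hypothetical labels.
SCORE_RANGES = {
    "meaning": range(0, 6),
    "associates": range(0, 5),
    "use_in_context": range(0, 5),
}


def check_score(measure: str, score: int) -> int:
    """Return the score if it is valid for the measure, else raise ValueError."""
    if measure not in SCORE_RANGES:
        raise ValueError(f"unknown measure: {measure}")
    if score not in SCORE_RANGES[measure]:
        raise ValueError(f"{measure} score out of range: {score}")
    return score


# Principle 4 example: "deter" produced for the target word "deterrent".
deterrent = {
    "associates": check_score("associates", 2),
    "use_in_context": check_score("use_in_context", 2),
}
print(deterrent)
```

Keeping the three measures separate, rather than collapsing them into one number, mirrors the decision to replace the single uni-dimensional VKS scale with multiple scoring scales.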

Measuring contextual richness and generation. Finally, a fourth measure was designed to assess 
both the level of generative processing evident in learner output and the extent to which word 
meanings were explicitly stated or could be inferred from oral and written contexts. Since this 
paper focuses on the quality compared with the quantity of encounters, it was important to see 
whether words that were learned better were in fact embedded in richer written or oral contexts 
or if they were more likely to have been learned better because of frequent encounters across the 
course. It was also important to compare the level of generative processing with the frequency of 
encounters. 

The level of generativeness scale (Joe, 1995) was employed to measure the level of generative 
processing evident in learner production. This scale made incremental distinctions between 
words used productively without any modification from the original source up to words used 
creatively in form and meaning from the original. 

Categories devised by Beck, McKeown, and McCaslin (1983) were used as the basis for 
measuring the level of contextual support for unknown items in oral and written contexts. They 
focused on the type of contextual support available in a text from which a reader could infer an 
unknown word meaning. 

As shown in Table 4, a single rating scale was used to combine both the levels of contextual 
richness and learner-generated output. In the scale, levels of “use” refer to productive output or 
level of generative use, and levels of “context” refer to the richness of the listening or reading 
input surrounding the lexical items. 


Table 4. Levels of contextual richness and generativeness

Score  Level                Interpretation

1      Verbatim use         No generation: no demonstrated effort to integrate meaning.
                            Learners repeat the text word for word.
       Verbatim context     Repeated exposure to the same word forms, collocations or sentences
                            through reading or listening (i.e., no new contextual information added).

2      Nonspecific use      Low generation: very little effort to integrate meaning. Learners make
                            structural changes to the target form. Very little elaboration.
       Nonspecific context  The context does not direct learners to understand a precise or general
                            word meaning (e.g., “What is trigger?”)

3      General use          Reasonable generation: reasonable effort to integrate meaning.
                            Reasonable elaboration on a word’s general properties and associations.
       General context      The context provides clues about the semantic field or general category
                            but not sufficiently to define precise properties of the word.

4      Specific use         High generation: considerable effort to integrate meaning.
                            Extensive elaboration on a word’s specific properties and associations.
       Specific context     The context directs learners to a specific meaning that can easily be
                            inferred.

Note. Adapted from Joe (1995, p. 151) and Beck et al. (1983). 


Before applying this scale, all target word forms and their word family members were identified 
in class materials, copies of Zeki’s reading materials and his learning materials. Each instance 
was then coded by the researcher. 

Results and Discussion 

The variables investigated are the number of days target words were encountered on the course, 
the richness of input surrounding the target words in authentic contexts, and the level of 
generative use in learner output. For transparency, the actual tokens of target words in listening, 
speaking, reading, and writing are presented in Tables 6, 10, and 13. 

Analysis is divided into three parts: (a) target words encountered solely through input, (b) target 
words used without any evidence of change from the original context (verbatim use), and (c) 
target words used with structural or semantic modifications. 

Target Words Encountered Solely Through Input 

Table 5 presents the results of Zeki’s vocabulary tests for five of the 20 target items met through 
listening or reading, that is, Zeki did not attempt to produce any of these items during the course. 
It shows he was unable to produce word associates or to use the words in context in either the 
pretests or the posttests. At best, he was able to provide a partial word meaning for one item in 
the posttest. 

Table 5. Test scores for words encountered solely through input

              Meaning            Associates         Use in context
Target item   Pretest  Posttest  Pretest  Posttest  Pretest  Posttest

disparity     0        0         0        0         0        0
sacred        0        1         0        0         0        0
launder       1        2         0        0         0        0
compromise    1        1         0        0         0        0
indifferent   1        1         0        0         0        0

Note. Scores for meaning, from 0-5; associates, from 0-4; use in context, from 0-4. 


Table 6 shows the level of contextual richness encountered in the input and the actual 
distribution of word tokens. Given the extremely limited number of meetings with words in 
listening or reading on the course, the results are probably not surprising. Minimal encounters 
through input alone did little to shift Zeki’s awareness of word knowledge or use beyond the 
original state for most words. 

Table 6. Level of contextual richness and distribution of words encountered solely through input 

Target item Verbatim use 

Non-specific 

General 

Specific 

disparity 

- 

2 X reading 

- 

sacred 

1 X reading 

1 X list 

- 

launder 

- 

1 X list 

- 

compromise 

1 X reading 

- 

1 X list 

indifferent 

- 

- 

1 X list 

Note. List = listening. 

However, exposure to one or two tokens of the target item in reading or listening did result in an 
increase in incremental vocabulary knowledge for two words. As Table 5 shows, knowledge of 
the word form sacred increased between the pretest and posttest. In addition, knowledge of the 
meaning of launder shifted from being unknown to partly known. 

Table 7 shows the total number of days words were met on the course and the total number of 
tokens, that is, the times the word actually occurred. Unsurprisingly, encountering only one or 
two tokens over a total of 3 or 4 days over a course was insufficient to move words from an 
unknown state to the ability to articulate a precise word meaning and the ability to use words in 
context. 


Table 7. Total number of days and word tokens

Target item   Total number of daysᵃ   Total number of tokensᵃ

disparity     4                       4
sacred        3                       4
launder       3                       3
compromise    4                       4
indifferent   3                       3

Note. ᵃIncluding cases where the target item was seen once during pretests and posttests. 


Input and Verbatim Use 

Table 8 shows the vocabulary test scores for the nine target items that Zeki reproduced in his 
writing or speaking, using exactly the same form as that in the original context. Four of these 
items remained unknown throughout the course. Five were recognized in form only, and just one 
word’s general semantic properties were known. 


Table 8. Test scores for words encountered through input and used verbatim

              Meaning            Associates         Use in context
Target item   Pretest  Posttest  Pretest  Posttest  Pretest  Posttest

deceased      0        0         0        0         0        0
dissuade      0        0         0        0         0        0
intact        0        0         0        0         0        0
thorough      0        0         0        0         0        0
intend        1        1         0        0         0        0
undermine     1        1         0        0         0        0
rapport       0        1         0        0         0        0
empathy       0        1         0        0         0        0
dissolve      1        2         0        0         0        0

Note. Scores for meaning, from 0-5; associates, from 0-4; use in context, from 0-4. 


Table 9 shows that most words occurred on only 1 or 2 days of the course, although half of them 
occurred with greater frequency. Encountering items on at least 4 days and at least six times over 
the course enabled Zeki to recognise the word forms intend, undermine, rapport, and empathy in 
the posttest, but this number of exposures was not sufficient to produce greater knowledge of 
word meaning. 


Table 9. Total number of days and word tokens 

Target item     Total number of days^a     Total number of tokens^a
deceased        3                          10
dissuade        3                          6
intact          3                          4
thorough        4                          7
intend          8                          9
undermine       5                          6
rapport         4                          8
empathy         4                          11
dissolve        4                          5

Note. ^a Including cases where the target item was seen once during pretests and posttests. 


Table 10 shows Zeki’s actual encounters with words that he used in speaking or writing without 
any modification from the original context. Let us examine in more detail how he used these 
words. 


Table 10. Level of contextual richness, generativeness, and distribution of words used verbatim 

Target item     Verbatim use              Non-specific      General           Specific
deceased        1 x writ                  3 x reading                         1 x list
                1 x collocation writ
                1 x spk
                1 x collocation spk
dissuade        1 x mng writ                                1 x list          2 x list
intact          1 x collocation writ      1 x reading
thorough        2 x mng writ              3 x reading
intend          1 x writ                  6 x reading
undermine       1 x collocation writ                        3 x reading
rapport         1 x mng writ                                2 x reading       1 x list
                1 x writ                                    1 x list
empathy         1 x writ                  2 x reading       3 x reading       2 x reading
                1 x writ
dissolve        1 x mng writ                                1 x reading       1 x list

Note. List = listening, spk = speaking, writ = writing; mng = meaning. 


The pretest and posttest scores remained the same for three items that Zeki studied directly for a 
weekly vocabulary test and met in contexts that provided no clues about the words’ general or 
precise meanings. Zeki encountered intact and thorough up to three times over 1 or 2 days while 
reading or studying vocabulary, but he failed to recognize the word forms in the posttest 2 
months later. In addition, Zeki was familiar with the word form intend but was unable to provide 
a general meaning in the posttest after encountering the word seven times over 6 days. 

Two items, rapport and empathy, are worthy of comment because they were embedded in rich 
written contexts and were studied directly for vocabulary tests. Including these words in his 
vocabulary study shows that Zeki noticed a gap in his lexical knowledge and wanted to further 
his vocabulary knowledge, yet he was unable to provide a plausible associate, a sentence, or a 
general meaning for these words. At best, he recalled having seen the words previously. 

As the excerpt below shows, Zeki noticed the word rapport in text and asked his teacher about 
its meaning. However, over-elaboration may have prevented Zeki from getting a clear sense of 
the meaning of rapport. That is, wordy explanations may not assist with clarifying word meaning 
(Chaudron, 1982). The turn ends with the teacher, and Zeki does not attempt to clarify, extend, 
or elaborate on the word meaning at all. 

Zeki: is it close to telepathy 

T1: yes and no some people would say yes but other people would say no. um it can come from 
just understanding someone very well, because you know them very well OK. for example my 
husband I. like last night we were at the supermarket and we said the same thing at the same time, 
it’s not because we’re telepathic it’s just because we know each other very well OK. and that’s a 
kind of rapport, um but you can also meet someone and right away you have a rapport with them 
so some people would say that’s telepathic [21/3 Class] 

Unlike rapport, empathy was richly elaborated on within written text. It was also recorded for 
vocabulary study, translated four times in writing, and used in the simple, non-defining sentence, 
“She has empathy.” Empathy was salient because it was central for text comprehension. It 
occurred seven times and was clearly defined in a “directive” context (Beck et al., 1983). Despite 
the very rich context in which the word was embedded, Zeki could not provide a word meaning. 
For both these items, an incremental shift in vocabulary knowledge was limited to greater 
awareness of word form. This gradual vocabulary growth is not to be dismissed. These examples 
illustrate how demanding vocabulary learning is. Multiple opportunities of different kinds were 
needed: incidental exposure to rich word meanings, direct vocabulary study, frequent encounters 
with the word in a single text, and noticing and comparing a new word with one that was known. 

Another point to consider is that the elusiveness of rapport and empathy may relate to inherent 
difficulties underlying the semantic properties of the words themselves. Because of the higher 
conceptual burden, Zeki may have needed further meetings with these words over the course to 
consolidate semantic and formal features of the words. A similar case could be made for the item 
intend. Although it was encountered on 8 different days across the course, it too proved to be 
elusive. 

Input and Modified Output 

Table 11 shows the vocabulary scores for words that were modified structurally or semantically. 
Six target words were elaborated on in writing or speaking. As is predicted in a depth of 
processing framework, the two words displaying the highest level of semantic elaboration were 
those that Zeki could define most precisely in the long term. 

What is striking about all the words that were changed from the original context, whether slightly 
or much more substantially, is evidence of the ability to use the words in context. The four words 
among these six that scored 0 for evidence of use in the pretest all increased to the point where 
plausible associates were provided, and an attempt was made to use the word in delayed tests. 


Table 11. Test scores for modified words 

                 Meaning                Associates             Use in context
Target item      Pretest   Posttest     Pretest   Posttest     Pretest   Posttest
intensive        1         1            0         2            0         1
determine        1         2            0         2            0         2
advocate         1         3            0         2            0         1
chronology       3         3            1         2            1         2
eligible         2         2            2         2            2         3
weak             1         3            0         4            0         2

Note. Scores for meaning, from 0-5; associates, from 0-4; use in context, from 0^. 


Table 12 shows the overall frequency with which Zeki met items used in a different way from 
the original. What is striking is that words used in a novel way were met much more frequently. 
They were distributed across the course over at least 8 days and occurred at least 12 times. 


Table 12. Total number of days and word tokens 

Target item     Total number of days^a     Total number of tokens^a
intensive       9                          23
determine       11                         19
advocate        10                         18
chronology      8                          19
eligible        10                         19
weak            9                          12

Note. ^a Including cases where the target item was seen once during pretests and posttests. 


Of all the target items, the one with the highest number of tokens was intensive, half of which 
were recycled within one task. In a class debate about immigrant children and teenagers being 
required to take an intensive English language course before entering regular schools, Zeki used 
the phrase “intensive English language course” twice and listened to classmates using it 11 times. 

While the score for word form familiarity remained the same at both the pretest and posttest, the 
word associates score increased to the point where Zeki could automatically produce a plausible 
associate within one context in the posttest. The ability to produce a memorised phrase but not to 
supply a general or precise word meaning is in line with findings from research on formulaic 
phrases or multiword units, which show that learners can produce chunks of language to aid 
fluency (Pawley & Syder, 1983) without necessarily having analysed each part (McNeill, 1996). 

Table 13 shows the levels of contextual richness and generativeness for modified words. Three 
important features can be noted: (a) a high frequency of distributed encounters with items 
across the course, (b) opportunities for frequent meetings with items in newspaper articles related 
to a topic of Zeki’s choosing (What are the arguments for and against compulsory 
superannuation in New Zealand?), and (c) the variability of contexts from which word meanings 
could be inferred. There was also evidence of direct vocabulary study and translations for all 
words except eligible. 


Table 13. Level of contextual richness, generativeness and distribution of word tokens (modified 
words) 

Target item     Verbatim use                   Non-specific              General                Specific
intensive       1 x writ                       5 x reading               5 x list               1 x list
                1 x spk                        1 x writ                  1 x spk
                                               6 x list
                                               1 x spk
determine       4 x writ                       2 x reading               8 x reading            2 x reading
                                                                         1 x writ
advocate        1 x mng writ                                             4 x reading            1 x reading
                2 x collocations writ                                    1 x writ (errors)      1 x writ (errors)
                6 x writ
chronology      1 x list                       1 x reading               1 x list               2 x reading
                5 x spk                        1 x list                  1 x mng spk            2 x spk
                                               2 x spk                                          1 x mng spk
eligible        1 x mng & collocation writ     6 x reading               5 x reading            2 x reading
                                                                         3 x spk
weak            1 x collocation writ           4 x reading               2 x reading
                2 x writ                       1 x collocation writ

Note. List = listening, spk = speaking, writ = writing; mng = meaning. 


The distribution of tokens for writing and speaking shows there is more general and specific 
generative use for the items advocate, chronology and eligible, which happen to have better 
vocabulary test scores than intensive and determine. 

While there is no record of Zeki using weak with high levels of generative processing in 
speaking or writing, he was observed reviewing weakness on his word card ring after having read 
a passage from a book about the impact of economic reforms in New Zealand. It would appear 
that he noticed a lexical gap or wanted to quickly confirm the word meaning. The metacognitive 
act of noticing the word, possibly evaluating its lexical status, guessing, and confirming the word 
meaning suggests deep level processing may have occurred. 

An important point to note with the items determine, advocate, chronology, and eligible is that 
they were recycled within and across different texts focusing on specialist topics of Zeki’s 
choosing, that is, economics, history, politics, and law. He not only had multiple opportunities to 
see target words in contexts of interest, he also used these words in novel ways. Below are 
examples of sentences he generated for vocabulary study and vocabulary tests. 

I thought because I read very quickly I thought he said if taxes er wouldn’t rise in the future ah do 
you the age of elibility eligibility rise to 17 in the future I thought but it was wrong was wrong 
understanding [9/4 Interview] 

Minister of finance has been advocated all of his colleagues and the PM in the House of 
Representatives [20/3 Vocab study] 

He’s advocated by a large group of lawyer (L1) [2/4 Vocab cards] 

His productive use of advocate, generated independently in the examples above, is validated by 
his performance in the posttest, as shown in the extract below. Zeki often read his wife’s law 
essays, so he was exposed to legal terms. This may account for the narrower definition given for 
advocate, “to defend a client,” rather than the wider sense, “to speak publicly in support of 
something.” 

Zeki: advocate, defend something, defend mm. like lawyer advocate, a lawyer advocates mm. 
[switches to the next target item] 

Researcher: so that’s a new one. if we go back to advocate you said before, that you defend 
something and like a lawyer would defend something, can you give me a sentence using the word 
advocate, or advocate 

Zeki: isn’t advocate a name of the lawyer, an advocator, what can I say advocate, it doesn’t mean 
it is an idea and you support it. if you defend or advocate something ah. it is like a. business or to 
educate your company’s, company’s pattern or. method yeah advocate advocate it’s the same 
advocate [18/6 Interv] 

We can see that Zeki evaluates how well the example fits with an ideal sentence illustrating the 
precise meaning intended. This does not quite hit the mark, but in terms of depth of processing 
theory, what is important is the cognitive effort involved in comparing, evaluating, and 
integrating old and new knowledge, that is, the transformation and generation of new knowledge. 
A key point is that cognitive processing is primary; accuracy relating to the learning product is 
secondary. This example is characteristic of his cognitive effort and approach to vocabulary 
perceived as relevant to his central interests. 

As alluded to previously, words that had been modified in some way were sourced mainly from 
self-selected tasks: (a) selecting words for direct vocabulary study, (b) reading news articles for 
a class project of his own choosing, and (c) reading books, essays, or internet articles related to 
his interests. This finding supports the view that learner input into decision-making processes 
enhances vocabulary acquisition. 

It reinforces the importance of greater need, search, and evaluation (Laufer & Hulstijn, 2001) to 
acquire new words. An element of autonomy in the selection of words and topics studied in 
sustained tasks can lead to higher levels of motivation because of greater learner interest and 
ownership of the task. Active engagement in the process of reading and choosing relevant 
extracts or lexical items for projects or personalised vocabulary programmes also requires 
evaluation of the usefulness of the items, for example, when deciding whether a word is worth 
studying for receptive or productive purposes. Each encounter with a word served to promote 
retrieval of word form, at a minimum, and potentially prompted Zeki to notice how words were 
used in context. 

As we have seen in this section, exposures to words accrued and contributed to Zeki’s ability to 
generate novel sentences. Let us now turn to the extent of the cumulative encounters that Zeki 
had with target words over the course. 


Reading in a Foreign Language 22(1) 



Joe: Encounters with vocabulary in an English for Academic Purposes programme 




Frequency of Encounters 

Table 14 summarises the average number of days that target words were encountered over the 
course and the average number of tokens according to Zeki’s comprehensive word knowledge 
and ability to use the items in context. 

Table 14. Pretest and posttest status of target words 

                                                Mean number     Mean number     Number of
                                                of days         of tokens       words
• Unknown form, meaning, and use                3               6               5
• Familiar word forms only with no              5               5.5             4
  evidence of plausible word use
• Minimal increase in word form                 4               7.5             3
  familiarity only
• Increase in word meaning with little or       3.5             4               2
  no evidence of use
• Unstable meaning or use                       0               0               0
• Increased evidence of plausible use           9.5             18              6


By classifying Zeki’s 20 target words according to various states of vocabulary knowledge 
between pretests and posttests, then averaging the days’ encounters associated with each state, 
we can identify key trends related to states of knowledge and frequency. Table 14 clearly reveals 
that frequent distributed meetings (9.5 days) with words averaging 18 tokens across the course 
were necessary for Zeki to use target words in plausible sentences. 
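
The averaging step described above can be illustrated with a minimal sketch in Python. It assumes that the six words showing increased evidence of plausible use are the six modified items, and it copies their day and token counts from Table 12. The names modified_words and mean_encounters are illustrative only and are not part of the study's procedure.

```python
from statistics import mean

# (total days, total tokens) for each modified target item, copied from Table 12.
modified_words = {
    "intensive": (9, 23),
    "determine": (11, 19),
    "advocate": (10, 18),
    "chronology": (8, 19),
    "eligible": (10, 19),
    "weak": (9, 12),
}


def mean_encounters(words):
    """Return (mean days, mean tokens) for a group of target items."""
    days = mean(d for d, _ in words.values())
    tokens = mean(t for _, t in words.values())
    return days, tokens


mean_days, mean_tokens = mean_encounters(modified_words)
print(f"mean days = {mean_days:.1f}, mean tokens = {mean_tokens:.1f}")
# prints: mean days = 9.5, mean tokens = 18.3
```

Rounded to whole tokens, this gives the 9.5 days and 18 tokens reported for words used plausibly in sentences.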

While words reported as unknown in all aspects of vocabulary knowledge were encountered on 
an average of 3 days across the course, with an average of six tokens, words used plausibly in 
illustrative sentences occurred three times as often. Zeki needed repeated exposures and 
opportunities to produce the words in order to generate meaningful and grammatically accurate 
sentences. As previous researchers (e.g., Sternberg, 1987, p. 92; Zahar et al., 2001) have stated, 
learning is more likely to occur when unknown words are met frequently across variable contexts. 

That is not to say, however, that fewer encounters with words over the course were unproductive. 
Encounters with words over 4 days of the course with as few as four instances resulted in 
incomplete vocabulary growth, evident in a partial ability to recognise word forms and retrieve 
word meanings and in attempts to use words in sentences. Obviously, the number of meetings 
needed to shift vocabulary knowledge, and the ability to use words productively, is influenced by 
other factors such as the underlying conceptual difficulty of the words themselves, cognates, 
opportunities for use, and the learner’s own purpose. 

Bearing in mind that Zeki began the course with a receptive knowledge of about two-thirds of 
the second thousand words and over a third of the University Word List, it is likely that greater 
numbers of exposures to words were required over the course because of greater gaps in his 
lexicon. If he had started off with a greater breadth of vocabulary knowledge, he would have had 
richer associations within his existing vocabulary networks, thereby lessening the learning 
burden. 

Quality of Input 

Other researchers (e.g., Beck et al., 1983; Zahar et al., 2001) investigating the effect of 
contextual richness on vocabulary acquisition have found acquisition to be a function of 
frequency, not contextual richness based on incidental exposure to reading alone. Table 15 
shows a comparison of mean scores for contextual richness and level of generative output. We 
can see there is little difference between words with high, medium, or low vocabulary test scores 
in how readily their precise meaning could be inferred from reading and listening texts 
(see Appendices A, B, and C for detailed analyses). This is in line with previous findings. 


Table 15. Mean for the richness of contexts and level of generative output 

                          Mean context (range 0-4)     Mean generativeness
Highest scoring words     2.54                         1.97
Medium scoring words      2.87                         1.55
Lowest scoring words      2.83                         0.57


Table 15 also shows that the higher vocabulary scores are associated with greater levels of 
generative use. As we saw with words that were modified and learned best, there was a tendency 
for words learned better to have been used with reasonable or high levels of generation. However, 
this was coupled with frequent encounters with words across the course. 
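
As a worked illustration of how the group means in Table 15 can be derived, the sketch below recomputes the row for medium scoring words from the individual ratings listed in Appendix B. It assumes that each group mean is the average of the per-word means, which is consistent with the figures reported in the appendix. The names context_ratings, generative_ratings, and group_mean are illustrative only.

```python
from statistics import mean

# Individual ratings copied from Appendix B (partly acquired words).
context_ratings = {
    "determine": [2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4],
    "advocate": [3, 3, 3, 3, 4],
    "intensive": [2] * 11 + [3] * 5 + [4],
}
generative_ratings = {
    "determine": [1, 1, 1, 1, 3],
    "advocate": [1] * 9 + [3, 4],
    "intensive": [1, 1, 2, 2, 3],
}


def group_mean(ratings_by_word):
    """Average each word's ratings, then average those per-word means."""
    return mean(mean(ratings) for ratings in ratings_by_word.values())


mean_context = group_mean(context_ratings)
mean_gen = group_mean(generative_ratings)
print(f"mean context = {mean_context:.2f}, mean generativeness = {mean_gen:.2f}")
# prints: mean context = 2.87, mean generativeness = 1.55
```

Averaging per-word means gives each word equal weight regardless of how many tokens it contributed, so a frequently met word such as intensive does not dominate the group figure.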


Conclusion 

What is noticeable about the words that were unknown in all aspects, or were reported to be 
familiar in both the pretest and posttest, is that they offered fewer opportunities for input and 
less evidence of noticing. 

Although Zeki’s test results show the greatest effect for richly elaborated words and for frequent 
distributed meetings with related texts, it would be wrong to dismiss the contributions that tasks 
such as verbatim copying and intensive, massed encounters had on incremental vocabulary 
development. The results do reveal that vocabulary learning is cumulative, with an initial shift 
from no knowledge to perceived word form familiarity. We saw how noticing a word, having 
opportunities for focused practice, and encountering words on 4-6 days distributed across the 
course moved Zeki’s vocabulary development incrementally, even though this was below his 
optimum threshold of 18 tokens over 9 or more days. 

To sum up, the majority of words that Zeki retained well long-term were met extensively across 
the course and involved greater levels of processing. Embedding words in rich, instructive 
contexts on its own did not contribute to better opportunities for vocabulary learning. It needed 
to be coupled with noticing and frequent meetings over a distributed period to improve 
vocabulary development. We have seen that vocabulary acquisition is indeed an incremental 
process, requiring multiple encounters with new or partially known words in a wide range of 
tasks. 

Notes 

1. Refer to Joe (2006) for a detailed description of the word recognition measure. 

2. This study applied Stahl and Fairbanks’ (1986, p. 75) definition of depth of processing: 
“decisions that require more mental effort, or require greater amount of available cognitive 
resources.” The depth of generative processing was operationalised by evidence of learners 
comparing, evaluating, and integrating new words with known words in tasks and by the extent 
of semantic elaborations observed in output (refer to Table 4 for descriptors used to measure 
different levels). 

Appendix A 

Descriptive Statistics for Best Acquired Target Words 

(Ratings and means for richness of contextual input and level of generative output) 

Most acquired     Test        Tokens   Tokens   Context rating               Mean      Generative rating    Mean
words (7+)        score       in       out                                   context   (gen)                gen
chronology        33 12 12    6        11       1, 2, 2, 3, 4, 4             2.6       1, 1, 1, 1, 1, 2,    2.18
                                                                                       2, 3, 4, 4, 4
eligible          22 22 23    13       4        2, 2, 2, 2, 2, 2, 3, 3,      2.69      1, 3, 3, 3           2.50
                                                3, 3, 3, 4, 4
weak              13 04 02    6        4        2, 2, 2, 2, 3, 3             2.33      1, 1, 1, 2           1.25
Mean                          8.3      6.3                                   2.54                           1.97


Appendix B 

Descriptive Statistics for Partly Acquired Target Words 

(Ratings and means for richness of contextual input and level of generative output) 


Total test        Test        Tokens   Tokens   Context rating               Mean      Generative rating    Mean
score 4-6         score       in       out                                   context   (gen)                gen
determine         12 02 02    12       5        2, 2, 3, 3, 3, 3, 3, 3,      3         1, 1, 1, 1, 3        1.4
                                                3, 3, 4, 4
advocate          13 02 01    5        11       3, 3, 3, 3, 4                3.2       1, 1, 1, 1, 1, 1,    1.45
                                                                                       1, 1, 1, 3, 4
intensive         11 02 01    17       5        2, 2, 2, 2, 2, 2, 2, 2,      2.41      1, 1, 2, 2, 3        1.8
                                                2, 2, 2, 3, 3, 3, 3, 3,
                                                4
Mean                          11.3     7                                     2.87                           1.55

Appendix C 

Descriptive Statistics for Least Acquired Target Words 

(Ratings and means for richness of contextual input and level of generative output) 

Total test        Test        Tokens   Tokens   Context rating               Mean      Generative rating    Mean
score 0-2         score       in       out                                   context   (gen)                gen
rapport           01 00 00    4        2        3, 3, 3, 4                   3.25      1, 1                 1
dissolve          12 00 00    2        1        3, 4                         3.5       1                    1
indifferent       11 00 00    1        0        4                            4         0                    0
empathy           01 00 00    7        2        2, 2, 3, 3, 3, 4, 4          3         1, 1                 1
undermine         11 00 00    3        1        3, 3, 3                      3         1                    1
compromise        11 00 00    2        0        4, 2                         3         0                    0
launder           12 00 00    1        0        3                            3         0                    0
deceased          00 00 00    4        4        2, 2, 2, 4                   2.5       1, 1, 1, 1           1
dissuade          00 00 00    3        1        4, 4, 3                      4         1                    1
intact            00 00 00    1        1        1                            1         1                    1
thorough          00 00 00    3        2        2, 2, 2                      2         1, 1                 1
sacred            01 00 00    2        0        2, 3                         2.5       0                    0
intend            11 00 00    6        0        2, 2, 2, 2, 2, 2             2         0                    0
disparity         00 00 00    2        0        3, 3                         3         0                    0
Mean                          2.92     1                                     2.83                           .57